Predicting Extraction Performance using Context Language Models

Authors

  • Eugene Agichtein
  • Silviu Cucerzan
Abstract

Exploiting lexical and semantic relationships in text can dramatically improve information retrieval accuracy. Most notably, named entities and relations between entities are crucial for effective question answering and other information retrieval tasks. Unfortunately, the success in extracting these relationships can vary for different domains and document collections. Predicting extraction performance is an important step towards integration of information extraction technology for high accuracy information retrieval. In this paper, we present a general language modeling method for quantifying the difficulty of information extraction tasks. We demonstrate the viability of our approach by predicting extraction performance of two real world tasks, Named Entity Recognition and Relation Extraction.

Similar resources

The Role of Self-efficacy, Self-esteem and Attitude in Predicting Writing Performance of Students in Ethiopian Context

The study aimed to investigate students’ self-efficacy, self-esteem, and attitude as determinants of their writing performance. The participants for the study were 373 South Gonder Zone Preparatory School students who were chosen using multistage sampling technique. Questionnaire and writing test were employed to gather data. Pearson’s Correlation technique was used to analyze the associations ...

Predicting of the Quality Attributes of Orange Fruit Using Hyperspectral Images

Background: Hyperspectral image analysis is a fast and non-destructive technique that is being used to measure quality attributes of food products. This research investigated the feasibility of predicting internal quality attributes, such as Total Soluble Solids (TSS), pH, Titratable Acidity (TA), and maturity index (TSS/TA); and external quality attributes such as color components (L*, a*, b*)...

Investigations Into Tandem Features

This report proposes and evaluates a number of tandem feature extraction schemes. The proposed schemes use confidence measures estimated from the MLP outputs to derive tandem-like features. The analysis of variance shows that the proposed features discriminate better between phone classes than conventional tandem features. But they become less discriminant as the HMM model becomes more complex i...

An Adaptive Approach to Named Entity Extraction for Meeting Applications

Named entity extraction has been intensively investigated in the past several years. Both statistical approaches and rule-based approaches have achieved satisfactory performance for regular written/spoken language. However when applied to highly informal or ungrammatical languages, e.g., meeting languages, because of the many mismatches in language genre, the performance of existing methods dec...

Grammar-based context-specific statistical language modelling

This paper shows how we can combine the art of grammar writing with the power of statistics by bootstrapping statistical language models (SLMs) for Dialogue Systems from grammars written using the Grammatical Framework (GF) (Ranta, 2004). Furthermore, to take into account that the probability of a user’s dialogue moves is not static during a dialogue we show how the same methodology can be used...

Publication date: 2005